{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6888a2ad",
   "metadata": {},
   "source": [
    "\n",
    "# Understanding and Using Embeddings\n",
    "\n",
    "Welcome to this hands-on workshop where we will explore **embeddings** and their importance in Visual AI. \n",
    "Embeddings play a crucial role in **image search, clustering, anomaly detection, and representation learning**.\n",
    "In this notebook, we will learn how to generate, visualize, and explore embeddings using **FiftyOne**.\n",
    "\n",
    "![using_embeddings](https://cdn.voxel51.com/getting_started_manufacturing/notebook2/using_embeddings.webp)\n",
    "\n",
    "## Learning Objectives:\n",
    "- Understand what embeddings are and why they matter in Visual AI.\n",
    "- Learn how to compute and store embeddings in FiftyOne.\n",
    "- Use embeddings for similarity search and visualization.\n",
    "- Leverage FiftyOne's interactive tools to explore embeddings.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cdd4c2d9",
   "metadata": {},
   "source": [
    "\n",
    "## What Are Embeddings?\n",
    "\n",
    "Embeddings are **vector representations** of data (images, videos, text, etc.) that capture meaningful characteristics. \n",
    "For images, embeddings store compressed feature representations learned by deep learning models. These features enable tasks such as:\n",
    "- **Similarity Search**: Find images that are visually similar.\n",
    "- **Clustering**: Group images with shared characteristics.\n",
    "- **Anomaly Detection**: Identify outliers in datasets.\n",
    "- **Transfer Learning**: Use learned embeddings to improve other AI tasks.\n",
    "\n",
    "### Further Reading:\n",
    "- [Introduction to Embeddings](https://www.tensorflow.org/text/guide/word_embeddings)\n",
    "- [Feature Representations in Deep Learning](https://pytorch.org/tutorials/beginner/nlp/word_embeddings_tutorial.html)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34dbec8d",
   "metadata": {},
   "source": [
    "\n",
    "## Generating Embeddings in FiftyOne\n",
    "\n",
    "FiftyOne provides seamless integration for embedding computation. \n",
    "You can extract embeddings using pre-trained deep learning models (such as CLIP, ResNet, or custom models) and store them in FiftyOne datasets.\n",
    "\n",
    "### How It Works:\n",
    "1. Load a dataset in FiftyOne.\n",
    "2. Extract embeddings from a model.\n",
    "3. Store and visualize embeddings.\n",
    "\n",
    "**Relevant Documentation:** [Computing and Storing Embeddings](https://voxel51.com/docs/fiftyone/user_guide/brain.html#computing-embeddings)\n",
    "\n",
    "<div style=\"border-left: 4px solid #3498db; padding: 6px;\">\n",
    "<strong>Note:</strong> You must install the `umap-learn>=0.5` package in order to use UMAP-based visualization. This is recommended, as UMAP is awesome! If you do not wish to install UMAP, try `method='tsne'` instead\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install fiftyone huggingface_hub gdown umap-learn torch torchvision"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Select a GPU Runtime if possible, install the requirements, restart the session, and verify the device information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "\n",
    "def get_device():\n",
    "    \"\"\"Get the appropriate device for model inference.\"\"\"\n",
    "    if torch.cuda.is_available():\n",
    "        return \"cuda\"\n",
    "    elif hasattr(torch.backends, \"mps\") and torch.backends.mps.is_available():\n",
    "        return \"mps\"\n",
    "    return \"cpu\"\n",
    "\n",
    "DEVICE = get_device()\n",
    "\n",
    "print(f\"Using device: {DEVICE}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Download dataset from source"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import fiftyone as fo # base library and app\n",
    "import fiftyone.utils.huggingface as fouh # Hugging Face integration\n",
    "\n",
    "dataset_name = \"MVTec_AD\"\n",
    "\n",
    "# Check if the dataset exists\n",
    "if dataset_name in fo.list_datasets():\n",
    "    print(f\"Dataset '{dataset_name}' exists. Loading...\")\n",
    "    dataset = fo.load_dataset(dataset_name)\n",
    "else:\n",
    "    print(f\"Dataset '{dataset_name}' does not exist. Creating a new one...\")\n",
    "    # Clone the dataset with a new name and make it persistent\n",
    "    dataset_ = fouh.load_from_hub(\"Voxel51/mvtec-ad\", persistent=True, overwrite=True)\n",
    "    dataset = dataset_.clone(\"MVTec_AD\")\n",
    "\n",
    "dataset_emb = fo.load_dataset(\"MVTec_AD_emb\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Alternative - For Colab users\n",
    "We can download the file from Google Drive using `gdown`\n",
    "\n",
    "Let's get started by importing the FiftyOne library, and the utils we need for a COCO format dataset, depending of the dataset format you should change that option. [Supported Formats](https://docs.voxel51.com/user_guide/dataset_creation/datasets.html#supported-formats)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import gdown\n",
    "\n",
    "url = \"https://drive.google.com/uc?id=1nAuFIyl2kM-TQXduSJ9Fe_ZEIVog4tth\" \n",
    "gdown.download(url, output=\"mvtec_ad.zip\", quiet=False)\n",
    "\n",
    "!unzip mvtec_ad.zip\n",
    "\n",
    "import fiftyone as fo \n",
    "\n",
    "dataset_name = \"MVTec_AD\"\n",
    "\n",
    "# Check if the dataset exists\n",
    "if dataset_name in fo.list_datasets():\n",
    "    print(f\"Dataset '{dataset_name}' exists. Loading...\")\n",
    "    dataset_emb = fo.load_dataset(dataset_name)\n",
    "else:\n",
    "    print(f\"Dataset '{dataset_name}' does not exist. Creating a new one...\")\n",
    "    dataset_ = fo.Dataset.from_dir(\n",
    "        dataset_dir=\"/content/mvtec-ad\",\n",
    "        dataset_type=fo.types.FiftyOneDataset\n",
    "    )\n",
    "    dataset_emb = dataset_.clone(\"MVTec_AD\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(dataset_emb)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5596cbed",
   "metadata": {},
   "source": [
    "\n",
    "## Exploring and Visualizing Embeddings\n",
    "\n",
    "Once embeddings are generated, we can visualize them using **dimensionality reduction techniques** like:\n",
    "- **t-SNE (t-Distributed Stochastic Neighbor Embedding)**\n",
    "- **UMAP (Uniform Manifold Approximation and Projection)**\n",
    "\n",
    "These methods reduce the high-dimensional feature space into 2D/3D representations for interactive visualization.\n",
    "\n",
    "**Relevant Documentation:** [Visualizing Embeddings in FiftyOne](https://docs.voxel51.com/brain.html#visualizing-embeddings), [Dimensionality Reduction](https://docs.voxel51.com/brain.html#visualizing-embeddings)\n",
    "\n",
    "<div style=\"border-left: 4px solid #3498db; padding: 6px;\">\n",
    "<strong>Note:</strong> Be patient, it will take about 5-10 minutes to compute the embeddings.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Compute embeddings for MVTec AD using CLIP\n",
    "\n",
    "import fiftyone.brain as fob\n",
    "import fiftyone.zoo.models as fozm\n",
    "\n",
    "# Load a pre-trained model (e.g., CLIP)\n",
    "model = fozm.load_zoo_model(\"clip-vit-base32-torch\")\n",
    "\n",
    "fob.compute_visualization(\n",
    "    dataset_emb,\n",
    "    model=model,\n",
    "    embeddings=\"mvtec_emb\",\n",
    "    brain_key=\"mvtec_embeddings\",\n",
    "    method=\"umap\",  # Change to \"tsne\" for t-SNE\n",
    "    num_dims=2  # Reduce to 2D\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset_emb.reload()\n",
    "print(dataset_emb)\n",
    "print(dataset_emb.last())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d486ee56",
   "metadata": {},
   "source": [
    "\n",
    "## Performing Similarity Search with Embeddings\n",
    "\n",
    "With embeddings, we can search for visually similar images by computing the nearest neighbors in the embedding space.\n",
    "FiftyOne provides built-in tools to perform **similarity search** efficiently.\n",
    "\n",
    "**Relevant Documentation:** [Performing Similarity Search](https://voxel51.com/docs/fiftyone/user_guide/brain.html#similarity-search)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "session = fo.launch_app(dataset_emb, port=5152, auto=False)\n",
    "print(session.url)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![similarity](https://cdn.voxel51.com/getting_started_manufacturing/notebook2/similarity.webp)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "186f702f",
   "metadata": {},
   "source": [
    "\n",
    "### Next Steps:\n",
    "Try using different models for embedding extraction, explore clustering techniques, and test similarity search with your own datasets! 🚀\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "manu_env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.17"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
